Abstract

In this article, we investigate the competitive interaction between electrical vehicles or hybrid
oil-electricity vehicles in a Cournot market consisting of electricity transactions to or from an underlying
electricity distribution network. We provide a mean field game formulation for this competition, and
introduce the set of fundamental differential equations ruling the behavior of the vehicles at the feedback
Nash equilibrium, referred to here as the mean field equilibrium. This framework allows for a consistent
analysis of the evolution of the price of electricity as well as of the instantaneous electricity demand in 
the power grid. Simulations precisely quantify those parameters and suggest that significant reduction of 
the daily electricity peak demand can be achieved by appropriate electricity pricing. 

I. Introduction 

Electrical vehicles (EV) and plug-in hybrid electrical vehicles (PHEV) have been recognized as natural
components of future electricity distribution networks, known as smart grids [1], [2], [3]. As opposed
to classical vehicles, EV and PHEV are equipped with batteries which can be charged or discharged by
using a simple plug-in connector compatible with the local electricity distribution grid. Thus, EV and
PHEV can be conceived as both energy consuming devices and mobile energy sources [4], [5], [6], [7].
In the former case, EV and PHEV can be seen as devices straining the energy demand of energy suppliers
and, thus, adding a new constraint to reliably distribute the electricity. In the latter case, EV and PHEV
can be used to store or even to transport the energy from one geographical area to another and thus to
increase the reliability of the energy supply in certain zones or time intervals.

In this framework, it is therefore an important economic and social challenge to enforce charge and
discharge policies for EV and PHEV in an optimal manner. Here, optimality must be interpreted in the
sense of individual revenue obtained by the EV and PHEV owners when participating in the energy
trades and also in terms of reliability of the energy supply process to the fixed consumers. In this paper,
we consider that a way to improve reliability is to allow EV and PHEV to buy and sell energy to or
from the smart grid, as in a classical Cournot competition [8]. Clearly, the price at which the energy is
sold and bought depends on the existing demand in the grid and also on the demand and offer resulting
from all the vehicles connected to the network. This competitive interaction resulting from the energy
trade, given a global price, can be analyzed using tools from dynamic game theory [9]. This is studied
for instance in [10], where a noncooperative game is played among a number of PHEV groups aiming
to sell part of their stored energy to the smart grid; an algorithm based on best response dynamics is
then proposed to allow PHEV groups to reach a Nash equilibrium.

Nonetheless, in practical scenarios, the number of vehicles might be so large that finite
dimensional game theory analysis might not necessarily bring enough insight about the global behavior
of the market. To overcome this problem, in this paper, we study the energy trade when the number of
vehicles tends to infinity and all vehicles are considered alike, following the paradigm of [11], [12]. More
precisely, we shall model this interaction as a mean field game [13], [14]. In contrast to finite games,
where each player follows the evolution of the state of the game and the actions taken by all other players
in order to maximize a given individual benefit, in the mean field game formulation, players do not react
to actions from individual players but rather to the aggregate behavior of all players. The notion of
(Nash) equilibrium in the context of mean field games is known as mean field equilibrium (MFE). When
focusing only on the class of regular functions of time and battery levels, a necessary condition for the
MFE is to be the solution of a coupled system of partial differential equations (PDE) which includes a
(backward) Hamilton-Jacobi-Bellman (HJB) equation and a (forward) Fokker-Planck-Kolmogorov (FPK)
equation.

The closest contribution to our specific problem setting is [15], [16]. Therein, a mean field game
approach to the study of oil production is developed. In [13], the selfish players are oil producers and
the mean field variable is the oil selling price. In this article, we develop a similar framework as in [13]
but on a finite time horizon, applied to both EV and PHEV, with vehicle owners as the selfish players
and electricity price as the mean field variable.

The remainder of this article unfolds as follows. In Section II, we describe the problem formulation in
the case where only electrical vehicles interact with the energy market. Therein, the problem is formulated
as a continuous time differential game with finite time horizon. This formulation is then written under the
form of a mean field game and the differential equations describing the MFE are presented. In Section
III, the same analysis presented for EV is carried out for the case of PHEV. In Section IV, we provide
numerical simulations and derive conclusions for both scenarios. Finally, in Section V, we conclude this
work.



II. Electrical Vehicles 

A. System Model 

Consider a finite set $\mathcal{K} = \{1, \ldots, K\}$ of EVs participating in energy trading with an underlying
electricity distribution network. The consumption rate of vehicle $k \in \mathcal{K}$ at time $t \in [0, T]$ is denoted
by $g_t^{(k)}$. This consumption rate is measured in units of electricity per time. We assume that $g_t^{(k)}$ is
deterministic and known by EV $k$. The amount of energy stored in the battery of vehicle $k$ at time $t$ is
denoted by $x_t^{(k)} \in [0, 1]$, quantified in energy units. Here, $x_t^{(k)} = 0$ for an empty battery and $x_t^{(k)} = 1$
for a fully charged battery. We denote by $\alpha_t^{(k)}$ the energy provisioning rate of vehicle $k$ at time $t$, that
is, the rate at which vehicle $k$ buys or sells its energy. We relate the variable $x_t^{(k)}$ to $g_t^{(k)}$ and $\alpha_t^{(k)}$ by
the following differential equation

$$\frac{d}{dt} x_t^{(k)} = \alpha_t^{(k)} - g_t^{(k)}, \tag{1}$$

where $\alpha_t^{(k)}$ and $g_t^{(k)}$ are chosen such that the trajectory $x^{(k)}$ is unique for a given initial $x_0^{(k)}$ and that,
for all $t$, $0 \le x_t^{(k)} \le 1$. Such an $\alpha^{(k)}$ is called an admissible provision rate.
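
As an illustration of the state equation (1), the following minimal sketch integrates the battery level with a forward Euler scheme and rejects provisioning rates that drive the level outside $[0, 1]$. The function name `integrate_battery`, the step count, and the constant rates in the usage line are assumptions made for this example only.

```python
def integrate_battery(x0, alpha, g, T=1.0, n_steps=1000):
    """Forward Euler integration of dx/dt = alpha(t) - g(t) on [0, T].

    alpha and g are callables of time (hypothetical inputs for illustration).
    Returns the list of battery levels, and raises ValueError if the level
    leaves [0, 1], i.e. if alpha is not admissible for this g and x0.
    """
    dt = T / n_steps
    x = x0
    path = [x]
    for i in range(n_steps):
        t = i * dt
        x = x + (alpha(t) - g(t)) * dt
        if not (0.0 <= x <= 1.0):
            raise ValueError("provisioning rate is not admissible")
        path.append(x)
    return path


# Usage: constant purchase rate 0.5 against constant consumption 0.3.
path = integrate_battery(0.5, lambda t: 0.5, lambda t: 0.3)
```

With these illustrative rates the battery gains 0.2 units over the period, so the level moves from 0.5 to about 0.7.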

In the following, we denote $x_t = (x_t^{(1)}, \ldots, x_t^{(K)})$ and $\alpha_t = (\alpha_t^{(1)}, \ldots, \alpha_t^{(K)})$ the battery level profile
and provisioning rate profile at time $t$, respectively. Consider now a predefined period $[0, T]$. We denote
$x^{(k)} = \{x_t^{(k)}, 0 \le t \le T\}$ and $\alpha^{(k)} = \{\alpha_t^{(k)}, 0 \le t \le T\}$ the trajectories of the battery level and
provisioning rates for EV $k$, respectively. We also denote $x = \{x_t, 0 \le t \le T\}$ and $\alpha = \{\alpha_t, 0 \le t \le T\}$
the trajectories of the battery level and provisioning rate profiles. We finally denote $\mathcal{A}^K$ the set of all
admissible provision rates $\alpha$.

The price at which vehicles either sell or buy electricity at time $t$ is determined by the function
$p_t : \mathbb{R}^K \to \mathbb{R}$, $\alpha_t \mapsto p_t(\alpha_t)$. The time dependency of the price $p_t$ models a realistic dynamic pricing
policy accounting for the energy demand for other services than EV battery loading. This function can be
tuned to create incentives for EV to sell or buy energy at specific time periods. In addition to electricity
price, other factors influence the energy trades of EV owners. We model the latter, for player $k$, by the
following set of functions. The function $h_t^{(k)} : \mathbb{R} \to \mathbb{R}$, $\alpha \mapsto h_t^{(k)}(\alpha)$ models the (psychological) cost for
player $k$ to buy or sell electricity at rate $\alpha$ at time $t$. Indeed, EV owners are more likely to trade energy
at some convenient time intervals, e.g. during nighttime when the EV is parked at home. The function
$f_t^{(k)} : [0, 1] \to \mathbb{R}$, $x \mapsto f_t^{(k)}(x)$ models the cost for vehicle $k$ to possess only a fraction $x$ of energy
reserves at time $t$. For instance, during periods of high energy consumption, the interest of EV owners
is to have maximally loaded batteries. Finally, $\kappa^{(k)} : [0, 1] \to \mathbb{R}$, $x \mapsto \kappa^{(k)}(x)$ models the cost for EV $k$
to end the trade period $[0, T]$ with a fraction $x$ of battery load. This function guarantees that EV owners
do not sell all their battery content at the end of the trade. A comprehensive discussion on the choices
of these functions is provided in Section IV.

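The goal of EV $k$ is to determine the provisioning rates that minimize its total cost $J_k : \mathcal{A}^K \to \mathbb{R}$,
$(\alpha^{(1)}, \ldots, \alpha^{(K)}) \mapsto J_k(\alpha^{(k)}, \alpha^{(-k)})$, over a time window $[0, T]$ given the provisioning rates $\alpha^{(-k)}$ chosen
by all the other EVs. That is,

$$J_k(\alpha^{(k)}, \alpha^{(-k)}) = \int_0^T \left( \alpha_t^{(k)} p_t(\alpha_t) + h_t^{(k)}(\alpha_t^{(k)}) + f_t^{(k)}(x_t^{(k)}) \right) dt + \kappa^{(k)}(x_T^{(k)}), \tag{2}$$

for a given initial state $x_0$. Note importantly that the instantaneous global price $p_t(\alpha_t)$ is a function of
the instantaneous provisioning rate profile $\alpha_t$, which in turn depends both on the instantaneous energy
reserve profile $x_t$ and on the initial energy reserve profile $x_0$.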

In the following, we formulate a differential game which models the interactions between the active 
EVs in the system. 

B. Classical Game Formulation 

We model the energy trades resulting from the interactions among the electrical vehicles and the smart
grid by a $K$-player continuous-time differential game of pre-specified fixed duration $T > 0$. Let $\mathcal{K}$, the
set of EV, be the set of players. The state of the game, at time $t$, is determined by the energy reserve
profile $x_t = (x_t^{(1)}, \ldots, x_t^{(K)})$, whose trajectory $x$ is determined by the initial state $x_0$ and, through the
players' control, by the state evolution equation (1). The cost function of player $k$ is defined by (2). The
objective of player $k$ is to determine a control trajectory $\alpha^{(k)}$ that minimizes its cost. At instant $t$, the
instantaneous control $\alpha_t^{(k)}$ is determined based on the information available to player $k$, which we denote
by the information set $\eta_t^{(k)}$. We will consider here that the information set corresponds to the singleton
$\eta_t^{(k)} = \{x_t^{(k)}\}$. That is, players are assumed memoryless as they remember neither their previous individual
states nor their previous instantaneous controls.

In the following, we describe the strategy of player $k$, that is, the mapping from the individual
information set to the space of individual controls. Let us denote the strategy of player $k$ by the mapping
$\gamma_t^{(k)} : \eta_t^{(k)} \to \mathcal{A}_t^{(k)}$, $x \mapsto \gamma_t^{(k)}(x)$, with $\mathcal{A}_t^{(k)}$ the set of admissible controls for player $k$ at time $t$. Given the
nature of the information sets, this strategy can be referred to as a non-anticipative (own-state) feedback
strategy. The action of player $k$ is therefore described as

$$\alpha_t^{(k)} = \gamma_t^{(k)}(x_t^{(k)}). \tag{3}$$

In the following, we will mostly use the notation $\alpha_t^{(k)}$, implicitly assuming the existence of a mapping
$\gamma_t^{(k)}$, and we denote $\bar{\mathcal{A}}_t = \bar{\mathcal{A}}_t^{(1)} \times \cdots \times \bar{\mathcal{A}}_t^{(K)}$, $\bar{\mathcal{A}}_t^{(k)} \subset \mathcal{A}_t^{(k)}$, the class of feedback strategies at time $t$. The
notation $\bar{\mathcal{A}}^K \subset \mathcal{A}^K$ will be used for the class of all $K$-vector feedback controls $\{\alpha_t, 0 \le t \le T\}$. We
recall that the interdependence between players in this game appears through the electricity price: the
individual control $\alpha_t^{(k)}$ depends on the global price $p_t(\alpha_t)$, which itself depends on all the other players'
individual controls $\alpha_t^{(k')}$, $k' \neq k$.

The formulation of the game is completed by further imposing that both the deterministic function
$g_t$ and the corresponding strategies $\gamma^{(1)}, \ldots, \gamma^{(K)}$, with $\gamma^{(k)} = \{\gamma_t^{(k)} : 0 \le t \le T\}$, are such that the
trajectory defined by the initial value $x_0$ and the differential equation (1) is well defined and unique.

Following the above game formulation, we consider as equilibrium notion the own-state feedback Nash 
equilibrium, which we define as follows. 

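Definition 1: The control profile $\alpha^* = (\alpha^{*(1)}, \ldots, \alpha^{*(K)}) \in \bar{\mathcal{A}}^K$ is an own-state feedback Nash
equilibrium (NE) if, for all $k \in \mathcal{K}$ and for all admissible control $\alpha = (\alpha^{(1)}, \ldots, \alpha^{(K)}) \in \bar{\mathcal{A}}^K$, it
holds that

$$J_k(\alpha^{*(k)}, \alpha^{*(-k)}) \le J_k(\alpha^{(k)}, \alpha^{*(-k)}), \tag{4}$$

with $\alpha_t^{*(k)} = \gamma_t^{*(k)}(x_t^{*(k)})$, $\alpha_t^{(k)} = \gamma_t^{(k)}(x_t^{(k)})$, $k \in \mathcal{K}$, and $x^*$, $x$ satisfying the state evolution (1), for a
common initial state $x_0$.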

Our interest in the NE lies in the fact that, at a state of NE, all the EV use a control policy from
which they have no reason to depart. Nonetheless, analyzing the NE of such a game when $K$ is greater
than one is a difficult problem. In fact, even if an NE exists, it would lead to solutions that are inherently
difficult to exploit. In particular, it is clear that, under this formulation, any change in the battery level
of a given player impacts all other players, which must react as a consequence. We aim at reducing this
complexity by adopting some additional, but reasonable, conditions.

C. Mean Field Game Formulation

In this section, we simplify the equilibrium analysis of the differential game presented in the previous 
section by considering the following hypotheses: (i) the set of players is sufficiently large to be considered 
infinite, and (ii) players are indistinguishable, in the sense that a different player labelling leads to the 
same joint state distribution. The first assumption is tenable here as we analyze a large population of 
EVs. The second assumption reflects the fact that all players have, to some extent, similar batteries and 
similar individual objectives (but obviously different battery states). 

From the assumptions of player indistinguishability and under the large $K$ limit setting [11], we can
drop the player indexes in the previous notations and model the (battery) states of the players at time
$t$ by a random variable $x_t$ with distribution $m(t, x)$. As such, $m(t, x)$ is the limiting distribution of the
empirical distribution $m^K(t, x)$ defined as

$$m^K(t, x) = \frac{1}{K} \sum_{k=1}^{K} \delta_{x_t^{(k)}}(x).$$
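
For a finite population, the empirical distribution can be approximated by a histogram in which every vehicle carries mass $1/K$. The sketch below is only illustrative; the function name `empirical_distribution`, the number of bins, and the sample levels are assumptions, not part of the model.

```python
def empirical_distribution(levels, n_bins=10):
    """Histogram approximation of the empirical distribution of K battery levels.

    levels holds one battery level in [0, 1] per vehicle; each vehicle
    contributes mass 1/K to the bin containing its level.
    """
    K = len(levels)
    masses = [0.0] * n_bins
    for x in levels:
        # A level of exactly 1.0 is assigned to the last bin.
        idx = min(int(x * n_bins), n_bins - 1)
        masses[idx] += 1.0 / K
    return masses


# Usage with four hypothetical vehicles.
masses = empirical_distribution([0.05, 0.15, 0.15, 1.0], n_bins=10)
```

As $K$ grows and the bins shrink, such histograms are the finite-population counterpart of the limiting distribution $m(t, \cdot)$.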

Now, in order to avoid the unrealistic assumption that all vehicles consume energy at the same rate at
any time instant, we model the EV consumption rate by the stochastic process $g_t\,dt + g_t \sigma_t\,dW_t$, with $W_t$ a
Brownian motion. The state evolution of $x_t$ is therefore described by the following stochastic differential
equation (SDE)

$$dx_t = \alpha_t\,dt - g_t (dt + \sigma_t\,dW_t) + dN_t, \tag{5}$$

with $x_0 \in [0, 1]$ (now seen as a random variable) having distribution $m_0 = m(0, \cdot)$. The term $dN_t$
is a reflective variable to ensure that $x_t$ remains in $[0, 1]$. Similar to above, we will assume that all
conditions are met for such a trajectory $x_t$ to be well-defined. Now, under the assumption of player
indistinguishability, the analysis of the game reduces to the study of the trajectory of the individual state
and individual control of a single player game (or equivalently, of a stochastic control problem), with
cost function $J : \mathcal{A} \to \mathbb{R}$, $\alpha \mapsto J(\alpha)$, with $\mathcal{A}$ the set of all controls $\{\alpha_t, 0 \le t \le T\}$ admissible for the
state dynamics (5), defined as

$$J(\alpha) = \mathrm{E}\left[ \int_0^T \left( \alpha_t p_t(m_t) + h_t(\alpha_t) + f_t(x_t) \right) dt + \kappa(x_T) \right], \tag{6}$$

for a given initial $(x_0, m_0)$, where $m_t = m(t, \cdot)$ is the distribution of the players among all individual
states, and $x_t$ satisfies the dynamics (5). The control $\alpha_t$ is a feedback control that can be seen as the
image $\alpha_t = \gamma_t(x_t)$ of the (own-state) feedback strategy $\gamma_t : \eta_t \to \mathcal{A}_t$, $x \mapsto \gamma_t(x)$ on the information set
$\eta_t = \{x_t\}$. The set of such controls is denoted $\bar{\mathcal{A}}_t$, and the set of control profiles $\{\alpha_t, 0 \le t \le T\}$ is
denoted $\bar{\mathcal{A}} \subset \mathcal{A}$.
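
To make the stochastic dynamics concrete, the following sketch simulates one trajectory of the battery level with an Euler-Maruyama step. The reflection term is handled by folding the state back into $[0, 1]$, which is one common discretization and an assumption of this example; the function name and all numerical values are hypothetical.

```python
import math
import random


def simulate_reflected_sde(x0, alpha, g, sigma, T=1.0, n_steps=200, rng=None):
    """Euler-Maruyama simulation of dx = alpha dt - g (dt + sigma dW) + dN.

    alpha(t, x) is an own-state feedback control, g(t) and sigma(t) are
    callables of time. Reflection at 0 and 1 is approximated by folding the
    state back into [0, 1] after each step (an assumed discretization).
    Returns the terminal battery level x_T.
    """
    rng = rng or random.Random(0)
    dt = T / n_steps
    x = x0
    for i in range(n_steps):
        t = i * dt
        dW = rng.gauss(0.0, math.sqrt(dt))
        x += alpha(t, x) * dt - g(t) * (dt + sigma(t) * dW)
        # Fold the state back until it lies in [0, 1].
        while x < 0.0 or x > 1.0:
            x = -x if x < 0.0 else 2.0 - x
    return x


# Usage: noiseless case, which reduces to the deterministic dynamics.
x_T = simulate_reflected_sde(0.5, lambda t, x: 0.5, lambda t: 0.3, lambda t: 0.0)
```

Repeating the simulation over many independent draws of the initial state and of the noise gives a Monte Carlo approximation of $m(t, \cdot)$ for a given feedback control.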

In this context, the energy trading price writes $p_t : \mathcal{M}_t \to \mathbb{R}_+$, $m_t \mapsto p_t(m_t)$, with $\mathcal{M}_t$ the class
of distributions $m_t$. The price can now be seen as a function of the total instantaneous EV demand
$\int_0^1 \alpha_t\,m_t(dx)$. However, for computational ease, we will instead consider that prices are fixed not by the
total EV consumption $\int_0^1 \alpha_t\,m_t(dx)$ but by the expected consumption $g_t + \frac{d}{dt}\left( \int_0^1 x\,m_t(dx) \right)$, where both
quantities only differ by an additional Brownian motion term when $\sigma_t > 0$. In practice, this suggests that
the energy regulators which set the instantaneous prices do not have the information on the instantaneous
demand at time $t$ but know the distribution $m_t$ at time $t$ (as we will see, this information is accessible
in anticipation at time $t = 0$). We therefore define $p_t$ as

$$p_t(m_t) = D(t, \cdot)^{-1}\left( g_t + \frac{d}{dt} \int_0^1 x\,m_t(dx) \right),$$

where D(t,p) is the total energy demand function (including both EV and external trades) at time t 
for a given price p, and the inverse is with respect to composition. Under the above assumptions, the 
continuous time differential game discussed in Section II-B becomes a mean field game of the type
introduced in the references cited in Section I.

D. Mean Field Equilibrium 

Our interest now is to transpose the notion of own-state feedback NE into the corresponding notion 
of equilibrium in the mean field game, namely the own-state feedback mean field equilibrium (MFE). 
Based on Definition 1, we state the following definition.

Definition 2: The control $\alpha^* \in \bar{\mathcal{A}}$ is a mean field equilibrium in (own-state) feedback strategies if, for
all $\alpha \in \bar{\mathcal{A}}$ consistent with $m^*$, it holds that

$$J(\alpha^*; m^*) \le J(\alpha; m^*), \tag{7}$$

where $J(\cdot; m^*)$ denotes $J(\cdot)$ with $m$ replaced by $m^*$ in its expression, $m^*$ being the distribution induced
by the mean field equilibrium $\alpha^*$ for the dynamics (5) and for a given initial state distribution $m_0$.

Let us define the value function $v : [0, T] \times [0, 1] \to \mathbb{R}$, $(u, y) \mapsto v(u, y)$, as follows,

$$v(u, y) = \inf_{\alpha \in \bar{\mathcal{A}}} \mathrm{E}\left[ \int_u^T \left( \alpha_t p_t(m_t) + h_t(\alpha_t) + f_t(x_t) \right) dt + \kappa(x_T) \right],$$

where $x_t$ is any solution to (5) with $x_u = y$.

According to [16], an MFE $\alpha^*$ for the game that generates a regular couple $(v, m)$ must be a solution
to the following (backward) Hamilton-Jacobi-Bellman equation

$$\partial_t v(t, x) = -\inf_{\alpha} \left\{ \alpha\,\partial_x v(t, x) + \alpha\,p_t(m_t^*) + h_t(\alpha) + f_t(x) \right\} + g_t\,\partial_x v(t, x) - \frac{1}{2} \sigma_t^2 g_t^2\,\partial_{xx}^2 v(t, x), \tag{8}$$

where $m(t, \cdot)^* = m_t^*$ is the solution of the following (forward) Fokker-Planck-Kolmogorov equation

$$\partial_t m(t, x) = -\partial_x \left[ (\alpha_t^* - g_t)\,m(t, x) \right] + \frac{1}{2} g_t^2 \sigma_t^2\,\partial_{xx}^2 m(t, x), \tag{9}$$

for given $m(0, \cdot)$.

In the following, we assume the control cost $h_t(\alpha_t)$ to be quadratic, i.e.

$$h_t(\alpha) = \frac{1}{2} H_t \alpha^2,$$

with $H_t > 0$ representing the unwillingness of the car owner to buy or sell energy at time $t$. This choice is
seemingly unnatural as it implies that users are more willing to buy or sell small quantities rather than
large quantities of energy. Nonetheless, under the mean field game formulation, this has to be understood
as the fact that, on average, only a limited population of users at time $t$ is willing (or able) to buy energy.
As such, intuitively, making the (psychological) cost of buying or selling energy larger for larger amounts
of energy forces only part of the population to buy or sell. As for the particular choice of a quadratic
cost rather than any other cost function, it is made mostly for analytical convenience.
Under this assumption, solving

$$\inf_{\alpha} \left\{ \alpha\,\partial_x v(t, x) + \alpha\,p_t(m_t^*) + h_t(\alpha) + f_t(x) \right\}$$

for all $t$, it is immediate by convexity arguments to see that the optimal trajectory $\alpha^*$ is explicitly given
by

$$\alpha_t^* = -\frac{1}{H_t} \left[ \partial_x v(t, x) + p_t(m_t^*) \right], \tag{10}$$

possibly subject to some boundary conditions to ensure that $x_t \in [0, 1]$ at all times. In the remainder of
the article, we will assume this condition always met, so that at no time we will consider EV owners
with completely full or completely empty batteries.
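
The closed form (10) can be checked numerically: for a quadratic cost with $H_t > 0$, the control-dependent part of the minimized expression is a convex parabola in the rate, so a brute-force search should land on the same value. The sketch below uses arbitrary illustrative numbers; the function names are hypothetical.

```python
def optimal_rate(dv_dx, price, H):
    """Closed-form minimizer of a * (dv_dx + price) + 0.5 * H * a**2 over a, for H > 0."""
    return -(dv_dx + price) / H


def control_cost(a, dv_dx, price, H):
    """Control-dependent part of the expression minimized in the HJB equation."""
    return a * (dv_dx + price) + 0.5 * H * a ** 2


# Usage with arbitrary values for the value gradient, the price and H.
a_star = optimal_rate(dv_dx=-2.0, price=0.5, H=30.0)
```

A negative value gradient larger in magnitude than the price yields a positive rate, i.e. the vehicle buys energy; the larger $H_t$, the smaller the traded quantity.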
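The HJB equation now becomes

$$\begin{aligned} 0 = {} & \partial_t v(t, x) - \left( \frac{1}{H_t} \left[ \partial_x v(t, x) + p_t(m_t^*) \right] + g_t \right) \partial_x v(t, x) \\ & - \frac{p_t(m_t^*)}{H_t} \left[ \partial_x v(t, x) + p_t(m_t^*) \right] + f_t(x) \\ & + \frac{1}{2 H_t} \left[ \partial_x v(t, x) + p_t(m_t^*) \right]^2 + \frac{1}{2} \sigma_t^2 g_t^2\,\partial_{xx}^2 v(t, x), \end{aligned}$$

which can be simplified as

$$\partial_t v(t, x) = \frac{1}{2 H_t} \left( \partial_x v(t, x) + p_t(m_t^*) \right)^2 + g_t\,\partial_x v(t, x) - f_t(x) - \frac{1}{2} \sigma_t^2 g_t^2\,\partial_{xx}^2 v(t, x)$$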
and the FPK equation is

$$\begin{aligned} \partial_t m(t, x) = {} & \left( \frac{1}{H_t} \left[ \partial_x v^*(t, x) + p_t(m(t, \cdot)) \right] + g_t \right) \partial_x m(t, x) \\ & + \frac{1}{H_t}\,\partial_{xx}^2 v^*(t, x)\,m(t, x) + \frac{1}{2} g_t^2 \sigma_t^2\,\partial_{xx}^2 m(t, x). \end{aligned}$$

These are the two fundamental differential equations to be solved, either explicitly or numerically, for
determining the MFE.
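
The algebra leading to the simplified HJB equation can be verified pointwise. In the sketch below, `hjb_substituted` evaluates the terms obtained after inserting the optimal rate into the HJB equation, and `hjb_simplified` evaluates the right-hand side of the simplified form; both function names and the numerical values are assumptions made for the check.

```python
def hjb_substituted(dv, d2v, p, f, g, H, sigma):
    """Terms added to d_t v in the HJB equation once the optimal rate is inserted.

    The equation then reads 0 = d_t v + hjb_substituted(...).
    """
    a_star = -(dv + p) / H  # optimal provisioning rate for the quadratic cost
    running = a_star * dv + a_star * p + 0.5 * H * a_star ** 2 + f
    return running - g * dv + 0.5 * sigma ** 2 * g ** 2 * d2v


def hjb_simplified(dv, d2v, p, f, g, H, sigma):
    """Right-hand side of the simplified form d_t v = hjb_simplified(...)."""
    return (dv + p) ** 2 / (2.0 * H) + g * dv - f - 0.5 * sigma ** 2 * g ** 2 * d2v


# Both forms describe the same equation, so their sum should vanish.
args = dict(dv=-1.2, d2v=0.7, p=0.4, f=0.25, g=0.3, H=30.0, sigma=0.1)
residual = hjb_substituted(**args) + hjb_simplified(**args)
```

The residual vanishes up to rounding for any choice of the arguments, which confirms that the two ways of writing the equation agree.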

In the next section, we extend the EV framework by turning the purely electrical vehicles into PHEV,
thereby introducing the possibility for players to select between two alternative sources of energy.



III. Plug-in Hybrid Vehicles 

A. System Model 

In this section, we consider that vehicles in the set $\mathcal{K}$ are PHEV. A PHEV can operate both with an
electrical energy source and an alternative energy source, for instance oil. The PHEV interacts with the
electricity distribution grid by trading electricity with an elastic price, while trading oil at a fixed price
(which is a natural assumption on a daily or even weekly basis). We describe the energy reserves of PHEV
$k$ by the two-dimensional vector $z_t^{(k)} = (z_{1,t}^{(k)}, z_{2,t}^{(k)})^T \in [0, 1]^2$, where $z_{1,t}^{(k)}$ is the amount of energy stored
in the batteries and $z_{2,t}^{(k)}$ the level of the oil tank. We denote the provisioning rates of electricity and oil
of PHEV $k$ by $\mu_{1,t}^{(k)} \in \mathbb{R}$ and $\mu_{2,t}^{(k)} \in \mathbb{R}$, respectively. In addition, we denote $\beta^{(k)} : \mathbb{R}_+ \times [0, 1]^2 \to [0, 1]$,
$(t, z) \mapsto \beta^{(k)}(t, z)$, with $z = (z_1, z_2)$, the function that determines the relative proportion of energy drawn
from the batteries of PHEV $k$ at time $t$. Typically, taking $\beta^{(k)}(t, z) = z_1/(z_1 + z_2)$ translates a policy
where energy is consumed without distinction between the energy sources. Note that, depending on the typical distances
covered by PHEV owners at time $t$ (e.g. weekdays against weekends), $\beta^{(k)}(t, z_t^{(k)})$ may explicitly depend
on $t$. Alternatively, we may have considered $\beta^{(k)}(t, z_t^{(k)})$ an additional control variable which can be set
optimally by the car owner depending on the status of the energy market. Nonetheless, for simplicity
of analysis, we do not consider this scenario here. We relate the variables $z_t^{(k)}$, $\mu_t^{(k)}$, and $\beta^{(k)}$ by the
following state evolution dynamics



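$$\frac{d}{dt} z_t^{(k)} = \begin{pmatrix} \mu_{1,t}^{(k)} \\ \mu_{2,t}^{(k)} \end{pmatrix} - \begin{pmatrix} \beta^{(k)}(t, z_t^{(k)}) \\ 1 - \beta^{(k)}(t, z_t^{(k)}) \end{pmatrix} g_t^{(k)}, \tag{11}$$

and, similar to previously, we consider only $\beta^{(k)}$ functions and $\mu_t^{(k)}$ controls which are admissible, in the
sense of their defining a unique solution $z_t^{(k)}$ for each $t$, $k$.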
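
The two-dimensional dynamics can be explored with the same forward Euler approach as in the EV case. The sketch below uses the proportional policy $\beta(t, z) = z_1/(z_1 + z_2)$ mentioned in the text; the function names, the step count, and the numerical values are assumptions for illustration.

```python
def beta_proportional(t, z):
    """Share of the consumption drawn from the battery: z1 / (z1 + z2)."""
    z1, z2 = z
    return z1 / (z1 + z2)


def integrate_phev(z0, mu, g, beta=beta_proportional, T=1.0, n_steps=1000):
    """Forward Euler integration of the PHEV reserves (battery z1, oil tank z2).

    mu(t) returns the pair of provisioning rates (electricity, oil) and g(t)
    the total consumption rate, split between both sources according to beta.
    """
    dt = T / n_steps
    z1, z2 = z0
    for i in range(n_steps):
        t = i * dt
        b = beta(t, (z1, z2))
        m1, m2 = mu(t)
        z1 += (m1 - b * g(t)) * dt
        z2 += (m2 - (1.0 - b) * g(t)) * dt
    return z1, z2


# Usage: no trading, constant consumption 0.2, reserves initially at (0.6, 0.2).
z1, z2 = integrate_phev((0.6, 0.2), lambda t: (0.0, 0.0), lambda t: 0.2)
```

Without trading, the proportional policy depletes both reserves while keeping their ratio constant, so the example ends near (0.45, 0.15).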

We then define the cost of PHEV $k$ in the time window $[0, T]$ as

$$L_k(\mu^{(k)}, \mu^{(-k)}) = \int_0^T \left( (\mu_t^{(k)})^T r_t(\mu_t^{(1)}, \ldots, \mu_t^{(K)}) + q_t^{(k)}(\mu_t^{(k)}) + s_t^{(k)}(z_t^{(k)}) \right) dt + \zeta^{(k)}(z_T^{(k)}), \tag{12}$$

for a given initial state $z_0 \in [0, 1]^{2K}$, i.e. the initial energy reserves of all PHEV, where $\mu = (\mu^{(1)}, \ldots, \mu^{(K)})$,
with $\mu^{(k)} = \{\mu_t^{(k)} = (\mu_{1,t}^{(k)}, \mu_{2,t}^{(k)})^T, 0 \le t \le T\}$ belonging to the set of admissible controls for the dynamics
(11).

Here, $r_t : \mathbb{R}^{2K} \to \mathbb{R}^2$, $(\mu_t^{(1)}, \ldots, \mu_t^{(K)}) \mapsto r_t(\mu_t^{(1)}, \ldots, \mu_t^{(K)}) = (r_{1,t}(\mu_{1,t}^{(1)}, \ldots, \mu_{1,t}^{(K)}), r_{2,t}(\mu_{2,t}^{(1)}, \ldots, \mu_{2,t}^{(K)}))^T$ evaluates
the instantaneous prices $r_{1,t}$ of electricity and $r_{2,t}$ of oil, given the controls $\mu_t^{(1)}, \ldots, \mu_t^{(K)}$. In
particular, we assume here that the price for oil is fixed, given by $r_{2,t}(\mu_{2,t}^{(1)}, \ldots, \mu_{2,t}^{(K)}) = r_2$. Note that in
this case the trajectory of the state $z = \{z_t = (z_t^{(1)}, \ldots, z_t^{(K)}), 0 \le t \le T\}$ is determined by the initial
state $z_0 = (z_0^{(1)}, \ldots, z_0^{(K)})$ and by the dynamics (11). We denote $\mathcal{Z}$ the set of state trajectories $z$.

The function $q_t^{(k)} : \mathbb{R}^2 \to \mathbb{R}$, $\mu \mapsto q_t^{(k)}(\mu)$ evaluates the psychological cost of trading a quantity $\mu_1$
of electricity and a quantity $\mu_2$ of oil at time $t$, where $\mu = (\mu_1, \mu_2)^T$. The function $s_t^{(k)} : [0, 1]^2 \to \mathbb{R}$,
$z \mapsto s_t^{(k)}(z)$ denotes the cost for PHEV $k$ to be in state $z = (z_1, z_2)$ at time $t$. Finally, $\zeta^{(k)} : [0, 1]^2 \to \mathbb{R}$,
$z \mapsto \zeta^{(k)}(z)$ is the cost for PHEV $k$ to be in state $z = (z_1, z_2)$ at time $T$. These are analogous to the
functions $h_t^{(k)}$, $f_t^{(k)}$, and $\kappa^{(k)}$ in (2), respectively.

In the following, we formulate the differential game with a finite number of players.

B. Classical Game Formulation 

The interaction between all PHEVs is modeled by a $K$-player continuous-time stochastic differential
game of pre-specified fixed duration $T > 0$. As for the case of EV, the aim of player $k$ is to determine
the control trajectory $\mu^{(k)} = \{\mu_t^{(k)}, 0 \le t \le T\}$ such that its cost in (12) is minimized given
the initial conditions $z_0$ and the control trajectories $\mu^{(-k)}$ adopted by all the other players. We denote
the set of all admissible controls of player $k$ over the time period $[0, T]$ by $\mathcal{U}_k$, and we denote
$\mathcal{U} = \mathcal{U}_1 \times \cdots \times \mathcal{U}_K$. At time $t$, the instantaneous control $\mu_t^{(k)}$ is determined based on the information
available to player $k$, which we denote by the information set $\eta_t^{(k)}$, as in the previous section. Here,
the information set corresponds to the singleton $\eta_t^{(k)} = \{z_t^{(k)}\}$. Let us denote the strategy of player
$k$ by $\theta_t^{(k)} : \eta_t^{(k)} \to \mathcal{U}_k$, $\eta_t^{(k)} \mapsto \theta_t^{(k)}(\eta_t^{(k)})$. As stated above, this strategy corresponds to the class of
non-anticipative own-state feedback strategies, and we will write

$$\mu_t^{(k)} = \theta_t^{(k)}(z_t^{(k)}). \tag{13}$$

The image of $\theta_t^{(k)}$, i.e. the set of own-state feedback controls, is denoted $\bar{\mathcal{U}}_k$ and we write $\bar{\mathcal{U}} = \bar{\mathcal{U}}_1 \times
\cdots \times \bar{\mathcal{U}}_K$.

C. Mean Field Game Formulation

In this section, we proceed similarly to Section II-C. We use here a finite-game counting measure $m^K$
of the form

$$m^K(t, z) = \frac{1}{K} \sum_{k=1}^{K} \delta_{z_t^{(k)}}(z),$$

with $z = (z_1, z_2)^T$, and we assume asymptotic player indistinguishability to ensure that it admits a weak
limiting distribution $m(t, \cdot)$ as $K \to \infty$. As previously, the individual state of each player is assumed to
be a noisy version of the deterministic state trajectory in (11), determined by the following SDE,



$$dz_t = \begin{pmatrix} \mu_{1,t} \\ \mu_{2,t} \end{pmatrix} dt - \begin{pmatrix} \beta(t, z_t) \\ 1 - \beta(t, z_t) \end{pmatrix} g_t \left[ dt + \sigma_t\,dW_t \right] + dN_t, \tag{14}$$

for a given initial state $z_0$. In particular, $W_t = (W_{1,t}, W_{2,t})^T$ is a two-dimensional Brownian motion
with independent components and $N_t$ is the associated reflection vector. Similar to the EV scenario, $\sigma_t$
determines the variance of the noise at time $t$. The analysis of the game now reduces to the analysis of
the behavior of a single player. The cost function $L_k = L$, assumed identical for all players, reads



$$L(\mu, m) = \mathrm{E}\left[ \int_0^T \left( \mu_t^T r_t(m_t) + q_t(\mu_t) + s_t(z_t) \right) dt + \zeta(z_T) \right], \tag{15}$$



where $m_t = m(t, \cdot) \in \mathcal{M}_t$ is the distribution of the state variable $z_t$ and $\mathcal{M}_t$ is the set of distributions at
time $t$. The initial state condition is $z_0 \in [0, 1]^2$, a random variable with distribution $m_0$. The price for
electricity is given by the function $r_{1,t} : \mathcal{M}_t \to \mathbb{R}_+$, with

$$r_{1,t}(m_t) = D(t, \cdot)^{-1}\left( g_t \int_{[0,1]^2} \beta(t, z)\,m_t(z)\,dz + \frac{d}{dt} \int_{[0,1]^2} z_1\,m_t(z)\,dz \right), \tag{16}$$

for $z = (z_1, z_2)^T$ in the integrals. The price for oil is constant, given by $r_{2,t} = r_2$.

The next section is dedicated to determining the MFE for this game.



D. Mean Field Analysis 

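Under the above game formulation, the optimal control problem which represents the equilibrium of
the game is formulated as

$$\begin{aligned} v(0, z_0) &= \inf_{\mu \in \mathcal{U}} L(\mu, m_0), \\ dz_t &= \begin{pmatrix} \mu_{1,t} \\ \mu_{2,t} \end{pmatrix} dt - \begin{pmatrix} \beta(t, z_t) \\ 1 - \beta(t, z_t) \end{pmatrix} g_t \left[ dt + \sigma_t\,dW_t \right] + dN_t. \end{aligned} \tag{17}$$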

We introduce the value function

$$v(u, y) = \inf_{\mu \in \mathcal{U}} \mathrm{E}\left[ \int_u^T \left( \mu_t^T r_t(m_t) + q_t(\mu_t) + s_t(z_t) \right) dt + \zeta(z_T) \right], \tag{18}$$

with initial value $z_u = y$.

As in the EV case, we consider the cost function as quadratic, that is,

$$q_t(\mu_t) = \frac{1}{2} Q_{1,t} (\mu_{1,t})^2 + \frac{1}{2} Q_{2,t} (\mu_{2,t})^2,$$

with $(Q_{1,t}, Q_{2,t}) \in \mathbb{R}^2$.

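The HJB equation, which provides a necessary condition for the existence of an MFE generating
regular couples $(v, m)$, is here given by

$$\begin{aligned} -\partial_t v(t, z) = {} & \inf_{(\mu_{1,t}, \mu_{2,t}) \in \mathbb{R}^2} \Big\{ \mu_{1,t}\,r_{1,t}(m_t^*) + \mu_{2,t}\,r_2 + q_t(\mu_t) \\ & \quad + (\mu_{1,t} - g_t \beta(t, z))\,\partial_{z_1} v(t, z) + (\mu_{2,t} + g_t (\beta(t, z) - 1))\,\partial_{z_2} v(t, z) \Big\} \\ & + s_t(z) + \frac{1}{2} \sigma_t^2 g_t^2 \Big[ \beta(t, z)^2\,\partial_{z_1 z_1}^2 v(t, z) + 2 \beta(t, z) (1 - \beta(t, z))\,\partial_{z_1 z_2}^2 v(t, z) \\ & \quad + (1 - \beta(t, z))^2\,\partial_{z_2 z_2}^2 v(t, z) \Big], \end{aligned} \tag{19}$$

where $m^* = \{m_t^*, 0 \le t \le T\}$, $m_t^* = m(t, \cdot)^*$, is solution to the FPK equation

$$\begin{aligned} \partial_t m(t, z) = {} & -\partial_{z_1} \left[ (\mu_{1,t}^* - \beta(t, z) g_t)\,m(t, z) \right] - \partial_{z_2} \left[ (\mu_{2,t}^* + (\beta(t, z) - 1) g_t)\,m(t, z) \right] \\ & + \frac{1}{2} \sigma_t^2 g_t^2 \Big[ \beta(t, z)^2\,\partial_{z_1 z_1}^2 m(t, z) + (1 - \beta(t, z))^2\,\partial_{z_2 z_2}^2 m(t, z) \\ & \quad + 2 \beta(t, z) (1 - \beta(t, z))\,\partial_{z_1 z_2}^2 m(t, z) \Big], \end{aligned} \tag{20}$$

with $\mu_t^* = (\mu_{1,t}^*, \mu_{2,t}^*)^T$ the cost minimizing ($z_t$-adapted) feedback control, determined by

$$\mu_{1,t}^* = -\frac{1}{Q_{1,t}} \left( r_{1,t}(m_t^*) + \partial_{z_1} v(t, z) \right), \qquad \mu_{2,t}^* = -\frac{1}{Q_{2,t}} \left( r_2 + \partial_{z_2} v(t, z) \right). \tag{21}$$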



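Assuming $\sigma_t = 0$, we obtain more compact forms. In particular, after substitution of the expression of
$\mu^*$, the HJB equation becomes

$$\begin{aligned} \partial_t v(t, z) = {} & \frac{1}{2 Q_{1,t}} \left( \partial_{z_1} v(t, z) + r_{1,t}(m_t^*) \right)^2 + \frac{1}{2 Q_{2,t}} \left( \partial_{z_2} v(t, z) + r_2 \right)^2 \\ & + g_t \beta(t, z)\,\partial_{z_1} v(t, z) + g_t (1 - \beta(t, z))\,\partial_{z_2} v(t, z) - s_t(z), \end{aligned} \tag{22}$$

where $m_t^*$ is solution to

$$\begin{aligned} \partial_t m(t, z) = {} & \left[ \frac{1}{Q_{1,t}}\,\partial_{z_1 z_1}^2 v^* + \frac{1}{Q_{2,t}}\,\partial_{z_2 z_2}^2 v^* + g_t \left[ \partial_{z_1} \beta(t, z) - \partial_{z_2} \beta(t, z) \right] \right] m(t, z) \\ & + \left[ \frac{1}{Q_{1,t}} \left( r_{1,t}(m(t, \cdot)) + \partial_{z_1} v^* \right) + \beta(t, z) g_t \right] \partial_{z_1} m(t, z) \\ & + \left[ \frac{1}{Q_{2,t}} \left( r_2 + \partial_{z_2} v^* \right) + (1 - \beta(t, z)) g_t \right] \partial_{z_2} m(t, z), \end{aligned} \tag{23}$$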



with $v^*$ the solution to (22), which is our final expression. Note in particular that, for $\beta(t, z) = \frac{z_1}{z_1 + z_2}$,
$z = (z_1, z_2)^T$, which we will use in Section IV, we have that

$$\partial_{z_1} \beta(t, z) - \partial_{z_2} \beta(t, z) = \frac{1}{z_1 + z_2}. \tag{24}$$
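
Identity (24) follows from $\partial_{z_1} \beta = z_2/(z_1 + z_2)^2$ and $\partial_{z_2} \beta = -z_1/(z_1 + z_2)^2$, and can be confirmed with central finite differences. The sketch below is a numerical sanity check only; the function names, the step size, and the evaluation point are arbitrary choices.

```python
def beta(z1, z2):
    """Proportional draw policy: share of consumption taken from the battery."""
    return z1 / (z1 + z2)


def derivative_gap(z1, z2, eps=1e-6):
    """Central finite-difference estimate of d(beta)/dz1 - d(beta)/dz2."""
    d1 = (beta(z1 + eps, z2) - beta(z1 - eps, z2)) / (2.0 * eps)
    d2 = (beta(z1, z2 + eps) - beta(z1, z2 - eps)) / (2.0 * eps)
    return d1 - d2


# Usage at an arbitrary interior point; the closed form gives 1 / (0.3 + 0.5).
gap = derivative_gap(0.3, 0.5)
```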



IV. Simulations 



In this section, we provide simulation results for the electrical vehicle schemes developed in Section
II and Section III.



A. EV analysis 

We first consider the scenario of Section II. We assume a realistic three-day scenario ($t = 0$ at midnight
the first day and $t = T = 1$ seventy-two hours later) where players have an average consumption rate that
depends on specific periods of the days. The scenario is typical of a Friday to Sunday energy consumption,
with higher overall electricity consumption on Friday and different patterns of car usage on Friday than
on Saturday and Sunday. Since it is difficult to provide a universal system parametrization, we will take
arbitrary scalings in the energy consumption functions.

The car electricity consumption function $g_t$ is depicted in Figure 1, where we see in particular that
consumption is higher on Friday, with a peak around 5pm, while consumption is lower on weekend
days with different peak times. The variance $\sigma_t^2$ of the consumption is taken equal to 0.01 at all times,
ensuring a standard deviation of the order of 10%. The demand function D(t,p) is such that the price p 
is a quadratic function of the total electricity demand from both electrical vehicles and other electricity
services. In this function, $d_t$ stands for the demand of electricity in services other than electrical cars,
and $[x]^+ = \max(x, 0)$ denotes the positive part.
We therefore assume that this demand is deterministic and is not altered by the evolution of EV electricity
price, which is a realistic assumption if the EV electricity market is independent of the outer electricity
trading market. The function $d_t$ is depicted in dashed line in Figure 4 up to a constant corresponding to
the total average EV consumption; that is, the dashed line represents the total electricity consumption if
EV consumption were distributed equally in time. For simplicity of understanding, we assume $h_t = 30$
constant; that is, we do not consider that the car owners have any particular incentive to charge or
discharge at some specific time periods.¹ We take $f_t(x) = (1 - x)^2$ to encourage consumers to keep a
certain level of electricity in their batteries, and the boundary condition $\kappa(x) = (1 - x)^2$ in order to avoid
large sales at the last minute. The initial condition on $m(0, \cdot)$ is a triangle distribution $m_0$ centered at 0.5
and with support $[0.3, 0.7]$. The boundary conditions on $m$ and $v$ are such that $\partial_x m(\cdot, 0) = \partial_x m(\cdot, 1) =
\partial_x v(\cdot, 0) = \partial_x v(\cdot, 1) = 0$ in order to force the energy content to lie in $[0, 1]$.
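
The triangular initial distribution can be discretized on the battery level grid as follows. This is a minimal sketch assuming a uniform grid of 100 points on $[0, 1]$ and a renormalization so that the discrete density sums to one; the function name and the renormalization step are choices made for the example.

```python
def triangle_density(x, lo=0.3, hi=0.7):
    """Density of the symmetric triangular distribution supported on [lo, hi]."""
    mid = 0.5 * (lo + hi)
    if x <= lo or x >= hi:
        return 0.0
    peak = 2.0 / (hi - lo)  # height at the center, so that the area equals 1
    if x <= mid:
        return peak * (x - lo) / (mid - lo)
    return peak * (hi - x) / (hi - mid)


# Discretization on a uniform grid of battery levels.
n_x = 100
xs = [i / (n_x - 1) for i in range(n_x)]
dx = xs[1] - xs[0]
m0 = [triangle_density(x) for x in xs]

# Renormalize so that the Riemann sum of the discrete density is exactly 1.
mass = sum(m0) * dx
m0 = [value / mass for value in m0]
```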

To solve the coupled system of HJB and FPK equations in $(m, v)$, we proceed by solving the HJB and
FPK equations sequentially, using a simple fixed-point algorithm, until convergence. We do not prove here
that this algorithm converges, nor that the solution obtained is the solution sought for.
Using a finite difference method on a sampling of 144 points in the time axis (every 30 min) and of
100 points in the battery level axis, the above scheme leads to the distribution evolution $m^*$ depicted in
Figure 2. A few observations can already be made from this figure. We easily observe daily sequences
of increases and decreases of the average battery levels. We see in particular that the battery levels
increase during nighttime, indicating that energy is purchased at night and consumed during daytime.
It is interesting to note that, due to the small variance $\sigma^2$ that was chosen, the overall tendency is for
$m^*(t,\cdot)$ to concentrate into a single mass when $t \to 1$. This is a usual phenomenon, which would determine
the steady state if time were to continue with constant values for all time-dependent system parameters.
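
The alternating scheme described above can be sketched as a generic fixed-point loop. The snippet below is only a skeleton under stated assumptions: `solve_hjb` and `solve_fpk` are hypothetical user-supplied solvers (the finite difference schemes themselves are not reproduced here), the damping factor is an addition that is not mentioned in the text (`damping=1.0` recovers the plain iteration), and the toy stand-ins at the bottom are not the actual equations. Consistently with the caveat above, the loop reports whether it converged rather than assuming it.

```python
import numpy as np

def solve_mfg_fixed_point(solve_hjb, solve_fpk, m_init,
                          tol=1e-8, max_iter=500, damping=0.5):
    """Alternate HJB and FPK solves until the density stops changing.

    solve_hjb(m) -> v and solve_fpk(v) -> m are hypothetical solvers.
    Returns (m, v, iterations, converged).
    """
    m = np.asarray(m_init, dtype=float)
    v = None
    for it in range(1, max_iter + 1):
        v = solve_hjb(m)                    # backward pass given the density
        m_new = solve_fpk(v)                # forward pass given the value function
        err = np.max(np.abs(m_new - m))     # sup-norm change of the density
        m = (1.0 - damping) * m + damping * m_new   # damped update
        if err < tol:
            return m, v, it, True
    return m, v, max_iter, False            # convergence is not guaranteed


# Toy stand-ins on a 144 x 100 (time x battery level) grid, NOT the real PDEs:
# their composition m -> 0.25 * m + 0.5 is a contraction with fixed point 2/3.
toy_hjb = lambda m: 0.5 * m + 1.0
toy_fpk = lambda v: 0.5 * v
m_star, v_star, n_iter, converged = solve_mfg_fixed_point(
    toy_hjb, toy_fpk, np.zeros((144, 100)))
```

In an actual run, the two lambdas would be replaced by a backward-in-time HJB solver and a forward-in-time FPK solver acting on arrays of the same shape.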

¹ Note that the determination of a correct $h_t$ is highly subjective; $h_t$ is better kept constant for the sake of interpretation.

From the expressions of $m^*$, $v^*$, and the equations derived in Section II, it is now possible to obtain
much information about the system. In particular, it is interesting to follow the electricity bought or sold
by electrical vehicles at all times, that is, the quantity



$$g_t + \frac{d}{dt} \int x\, m^*(t,x)\, dx$$

or the overall electricity consumption in the market given by 

$$g_t + \frac{d}{dt} \int x\, m^*(t,x)\, dx + d_t$$

and the price $p_t(m^*_t)$, defined here as

$$p_t(m^*_t) = \left( \left[ g_t + \frac{d}{dt} \int x\, m^*(t,x)\, dx \right]^+ + d_t \right)^2.$$




These quantities are depicted in Figure 3, Figure 4, and Figure 5, respectively.
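
As a hedged illustration of how these three quantities could be evaluated from a discretized density, the sketch below uses a trapezoidal rule in the battery level and `numpy.gradient` for the time derivative. The array layout `m[i, j] = m(t_i, x_j)`, the function names, and the quadratic price form with a positive part (as suggested by the text: quadratic in the total demand, with $[x]^+ = \max(x, 0)$) are assumptions made for this example.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule along the last axis."""
    dx = np.diff(x)
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * dx, axis=-1)

def ev_purchase(m, t, x, g):
    """g_t + d/dt of the mean battery level, for each time sample."""
    mean_level = trapezoid(x * m, x)            # integral of x * m(t, x) dx
    return g + np.gradient(mean_level, t)       # finite-difference time derivative

def total_consumption(m, t, x, g, d):
    """EV purchases plus the exogenous demand d_t."""
    return ev_purchase(m, t, x, g) + d

def price(m, t, x, g, d):
    """Assumed quadratic price: (max(EV purchase, 0) + d_t) squared."""
    return (np.maximum(ev_purchase(m, t, x, g), 0.0) + d) ** 2
```

Here `g` and `d` may be scalars or arrays with one entry per time sample; NumPy broadcasting handles both cases.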

We see first in Figure 3 that the peaks of electricity bought by electrical vehicles take place during
the night, when the overall demand is low, while purchases are at their lowest during peak demand periods.
This is a natural outcome of the fact that prices are high during peak demand periods. However, we also
see that the difference of amplitude between lowest and highest purchases is not large. This is due to
the fact that, while prices are high in peak demand periods, the EV owners still have a strong incentive
not to find their batteries empty, driving them to keep buying electricity at peak periods. This behaviour
can be hindered by relaxing the constraint $f(t,x)$.

Of more interest is Figure 4, where the difference between electricity consumption with or without
incentives on EV behaviour is presented. This figure depicts in dashed line the overall energy consumption
if the EV purchases were equally distributed in the three-day period (that is, with no incentive), and in
plain line the overall consumption under our current assumptions. It is seen here that the price incentives
on electricity purchases produce the expected peak demand reduction in the critical day periods,
and a simultaneous increase of consumption during low consumption periods. Note importantly that our
analysis does not consider changes in $d_t$ when the price for electricity changes; only the part of electricity
reserved for EV drives prices, which in turn drive the EV behaviour. This is a natural assumption if
different price conditions are applied to EV and other services. The price evolution is depicted in Figure 5,
where it is seen that, in this setting, the price is mostly driven by the function $d_t$.



Fig. 1. Mean energy consumption $g_t$ of EV as a function of time.

Fig. 2. Density solution $m^*(t,x)$ as a function of the time $t$ (in hours) and the battery level $x$.

Fig. 3. Electricity purchased by EV as a function of the time $t$.

Fig. 4. Total electricity consumption with or without EV regulation as a function of the time $t$.

Fig. 5. Evolution of the price $p_t(m^*)$ as a function of the time $t$.

B. PHEV analysis

In this second section, we analyze the behavior of hybrid vehicles as described in Section III. Since
solving three-dimensional differential equations is time-consuming, we only provide results for
a time scale discretized in 12 samples and for "spatial" scales both discretized in 16 samples. For
each differential equation, the resolution is performed by iterating the resolution of the two-dimensional
differential equations along the time and electricity scales for each fixed oil tank level, and along the
time and oil scales for each fixed battery level. Then the system of HJB and FPK differential equations
is solved by further iterating a fixed-point algorithm as in the previous section. For simplicity of
interpretation, we consider here a time-independent scenario where both $g_t = 0.2$ and
$(h_{1,t}, h_{2,t}) = (125, 125)$ are constant with time.² We take the electricity price to be
$r_{1,t} = D(t,\cdot) + 0.5$, where now the demand $D$ is solely due to the electricity being bought by PHEVs;
that is, we do not consider other sources of electricity consumption, in order to focus solely on the
oil/electricity interaction. The oil price is set to $r_{2,t} = r_2 = 0.7$. This is a natural choice as it is
expected that an approximate quantity $g_t = 0.2$ will be asked for at any time to cover for the energy
consumed, hence a price for electricity $r_{1,t} \simeq 0.7$. We impose a constraint
$f_t(z) = 20(2 - z_1 - z_2)^2$, where $z = (z_1, z_2)^T$. The relative consumption $\beta$ of oil
and electricity is proportional to the total quantity of energy, that is, $\beta(t,z) = z_1/(z_1 + z_2)$ and therefore
$1 - \beta(t,z) = z_2/(z_1 + z_2)$. We take $\sigma_t = 0$ for simplicity. The boundary constraints are identical to
those in the previous section. As for the terminal constraint on $v$, it imposes that
$v(T,z) = 10(2 - (z_1 + z_2))^2$.

² Such a large value for the entries of $h_t$ is motivated by faster algorithm convergence, although it inhibits, as a counterpart, fast variations of $m$ along time.
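
The PHEV cost terms given above can be evaluated on the 16 x 16 grid of battery and oil levels as in the following minimal sketch. The function names, the convention that the first coordinate is the battery level and the second the oil level, and the small `eps` guard at the corner where both levels are zero (where the ratio is otherwise undefined) are assumptions made for this example.

```python
import numpy as np

def beta(z1, z2, eps=1e-12):
    """Share z1 / (z1 + z2); eps guards the corner z1 = z2 = 0 (an assumption)."""
    return z1 / np.maximum(z1 + z2, eps)

def running_penalty(z1, z2):
    """Constraint 20 * (2 - z1 - z2)^2, pushing total energy towards 2."""
    return 20.0 * (2.0 - z1 - z2) ** 2

def terminal_cost(z1, z2):
    """Terminal condition 10 * (2 - (z1 + z2))^2 on the value function."""
    return 10.0 * (2.0 - (z1 + z2)) ** 2

levels = np.linspace(0.0, 1.0, 16)                    # 16 samples per "spatial" scale
Z1, Z2 = np.meshgrid(levels, levels, indexing="ij")   # Z1: battery, Z2: oil (assumed)
B = beta(Z1, Z2)                                      # share on the whole grid
```

Both penalties vanish only when battery and tank are full, which is why the running constraint pulls the sum of the two levels towards 2.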

We consider the scenario where $m(0,\cdot)$ is a (properly truncated and scaled) Gaussian distribution with
mean $(0.4, 0.6)^T$ and covariance $0.02\, I_2$, with $I_2$ the $2 \times 2$ identity matrix. That is, we assume that, initially,
most vehicles have more oil than electricity. This is depicted in Figure 6. We then let the system evolve
freely under the above set of constraints. It is natural to guess that the overall behavior is a decrease of
either or both quantities of oil and electricity to zero if the prices are too high, or an increase of either or
both quantities to one if the prices are more reasonable. What is interesting to observe is the trajectory
jointly followed by the players. The resulting final distribution $m^*(T,\cdot)$ is depicted in Figure 7. What we
observe in the aforementioned conditions is that the initial distribution has shifted towards an increase of
both electricity and oil levels, with a stronger increase of the mean battery level. Another observation is
that the distribution tends to stretch along the $z_1 = z_2$ diagonal in the figure, reflecting the fact that oil
and electricity are seen almost as equivalent goods due to the loosely constraining energy cost policy.
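
A truncated and rescaled Gaussian initial density of this kind can be built on the discretization grid as sketched below. The grid of 16 samples per axis follows the text; the function name, the ordering of the axes (battery level first, oil level second), and the Riemann-sum normalization are assumptions made for illustration.

```python
import numpy as np

def truncated_gaussian_2d(levels, mean=(0.4, 0.6), var=0.02):
    """Isotropic Gaussian with covariance var * I_2, restricted to the grid
    levels x levels and rescaled so that its Riemann sum equals 1."""
    z1, z2 = np.meshgrid(levels, levels, indexing="ij")
    sq_dist = (z1 - mean[0]) ** 2 + (z2 - mean[1]) ** 2
    m0 = np.exp(-sq_dist / (2.0 * var))     # unnormalized Gaussian bump
    dz = levels[1] - levels[0]              # uniform grid spacing
    return m0 / (m0.sum() * dz * dz)        # truncation handled by rescaling

levels = np.linspace(0.0, 1.0, 16)
m0 = truncated_gaussian_2d(levels)          # m0[i, j]: battery level i, oil level j
```

Rescaling after restricting to the unit square is one simple way to implement the truncation; the mass that an untruncated Gaussian would place outside the square is redistributed proportionally.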

Fig. 6. Initial distribution $m(0,\cdot)$ at time $t = 0$, as a function of both levels of battery and oil tank.

Among the different further analyses, in Figure 8 we consider a section of the distribution of the
optimal transaction policies $\mu^*_{1,t}$ and $\mu^*_{2,t}$ at time $t = 0^+$, for $z_{2,t} = 0.5$ and $z_{2,t} = 0.9$ (recall that
both $\mu^*_{1,t}$ and $\mu^*_{2,t}$ are functions of $t$, $z_{1,t}$ and $z_{2,t}$). That is, we observe the initial behavior of players
with half-filled oil tanks and almost completely filled oil tanks. It is seen that, for users with a very low
level of electricity, buying electricity is an appealing choice. This can be explained by the fact that, as
few players are in strong need of energy, it is possible to acquire a large quantity of electricity at a
reasonable price. Those players with low reserves of electricity are the main beneficiaries. For users who
already have a reasonable level of electricity, though, electricity and oil are seen as equivalent goods. As a
matter of fact, our results also show that, at time $t = 0^+$, the price of electricity equals $r_{1,t} = 0.706 \simeq r_2$.
That is, the players with low electricity levels draw as much of the electricity overhead (compared to oil)
as is needed to reach an equilibrium price with oil. Now, it is also observed that, for users with large
quantities of oil, electricity becomes a compelling purchase in order to further increase the total quantity
of energy (since $f$ imposes $z_{1,t} + z_{2,t}$ to be close to 2), hence a larger incentive for buying electricity
when the battery level is not large. When both battery and tank levels are alike, we see that the quantity
of electricity purchased is the same as the quantity of oil purchased.

Obviously, given the very generic settings of both EV and PHEV problems, many more scenarios can
be carried out so as to evaluate the actual impact of the EV and PHEV on realistic smart grid scenarios. The
simulations above and their interpretations only provide a framework of fully rational vehicle owners'
behavior, which must be transposed to real-life conditions with extreme care.

Fig. 7. Final distribution $m^*(T,\cdot)$ as a function of both levels of battery and oil tank.

Fig. 8. Optimal transactions at time $t = 0^+$ for players with different oil and battery levels.

V. Conclusion 

In this article, we proposed a game theoretical framework to model the behavior of electrical vehicle
and hybrid electricity-oil vehicle owners aiming at selfishly minimizing their operating cost. As the
number of selfish players is large, and players are assumed alike, we then turned the problem into a
mean field game, for which we obtained the fundamental differential equations describing the mean field
equilibrium of the game. Using numerical methods, we drew conclusions which give new insights on the
way to optimize the electrical vehicle penetration in the future smart grid.

December 13, 2011 DRAFT